Conversation
guangy10
left a comment
@chmjkb Can you add a test for the Whisper model under optimum-executorch/tests/? The test should be straightforward; we only need to test two things: 1) export to ET, and 2) an e2e task using the PTE via the HF API.
Once we have the test, @michaelbenayoun can help kick off the CI.
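A rough sketch of what such a test could look like. This is not the PR's actual test: the class name `ExecuTorchModelForSpeechSeq2Seq`, the `recipe="xnnpack"` argument, and the `generate` call are assumptions modeled on optimum-executorch's API for other tasks.

```python
# Sketch only, not the PR's test. Class name, recipe value, and generate
# signature are assumptions modeled on optimum-executorch's other tasks.
import unittest

import torch

from optimum.executorch import ExecuTorchModelForSpeechSeq2Seq


class TestWhisper(unittest.TestCase):
    def test_whisper_export_and_e2e(self):
        model_id = "openai/whisper-tiny"
        # 1) Export to ExecuTorch: from_pretrained exports the model and
        # loads the resulting PTE program.
        model = ExecuTorchModelForSpeechSeq2Seq.from_pretrained(model_id, recipe="xnnpack")
        self.assertIsNotNone(model)
        # 2) e2e task via the HF-style API: run generation on a dummy
        # log-mel spectrogram of shape [batch, feature_size, nb_max_frames].
        input_features = torch.rand(1, 80, 3000)
        tokens = model.generate(input_features)  # assumed signature
        self.assertGreater(len(tokens), 0)
```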
Hi @guangy10, I rebased and added the test, lmk if that looks good 👍🏻
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
assert encoder_input_ids.shape == torch.Size(
    [1, 80, 3000]
), f"Whisper only accepts a log-mel spectrogram of shape [1, 80, 3000], passed shape: {encoder_input_ids.shape}"
```
@chmjkb This is the config for whisper-tiny only, right? If I switch to a different Whisper variant, e.g. whisper-large-v3, it won't work. I think we can load this dynamically from preprocessor_config.json?
IIUC, each dim in encoder_input_ids represents [batch_size, feature_size, nb_max_frames]?
Yeah, I think Whisper large is an exception: it takes in 128 features instead of 80. Will fix that. (The smaller ones should work with 80.)
Actually, I'm thinking about getting rid of this assertion. Instead, I could just resolve the correct shape when instantiating example_encoder_input_ids (I'm doing this anyway). If a user passes the wrong shape for some reason, so be it.
Also, WhisperEncoder itself raises a ValueError when the length of the features is not correct.
WDYT?
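For reference, a minimal sketch of resolving the shape dynamically, assuming the standard transformers feature extractor attributes (`feature_size`, `nb_max_frames`):

```python
# Minimal sketch: derive the example encoder input shape from the model's
# preprocessor config instead of hardcoding [1, 80, 3000]. Assumes the
# WhisperFeatureExtractor attributes feature_size and nb_max_frames.
import torch
from transformers import AutoFeatureExtractor

feature_extractor = AutoFeatureExtractor.from_pretrained("openai/whisper-tiny")
# feature_size is 80 for most Whisper variants and 128 for whisper-large-v3;
# nb_max_frames is 3000 (30 s of audio at the default hop length).
example_encoder_input_ids = torch.rand(
    1, feature_extractor.feature_size, feature_extractor.nb_max_frames
)
```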
```python
if isinstance(self.full_model, WhisperForConditionalGeneration):
    dynamic_shapes = None
else:
    # Define dynamic dimension for encoder output sequence length
    encoder_seq_len_dim = torch.export.Dim("encoder_hidden_seq_length", max=self.max_hidden_seq_length)
    dynamic_shapes = {
        "decoder_input_ids": None,
        "encoder_hidden_states": {1: encoder_seq_len_dim},
        "cache_position": None,
    }
```
@tugsbayasgalan @pianpwk Can we use Dim.AUTO here, to avoid setting dynamic_shapes explicitly for different models? In this case, Whisper would expect all static shapes, and T5 would want encoder_hidden_seq_length to be dynamic at least.
Yep, Dim.AUTO would be perfect here. That way, you don't need the if/else branching.
I tried changing the code to the following:
```python
dynamic_shapes = {
    "decoder_input_ids": None,
    "encoder_hidden_states": {1: torch.export.Dim.AUTO},
    "cache_position": None,
}
```
Unfortunately, when I do that, the export fails with the following error:
```
RuntimeError: Cannot evaluate the shape upper bound of a dynamic-shaped tensor to a concrete bounded integer. Got tensor spec: TensorSpec(dtype=torch.float32, shape=[1, s0, 384], layout=torch.strided, is_sparse=False, shape_dynamism=1, const=False, requires_grad=True).
The upper bound shape we get [1, int_oo, 384], the upper bound stride we get [int_oo, 384, 1]
This tensor could either be from 1. a data-dependent operation such as nonzero. Or 2. an input, whose don't have a constraint for the upper bound.
Please use export's constrain_as_size() or constrain_as_value() apis and set a concrete upper bound to resolve this.
```
However, doing:
```python
dynamic_shapes = {"input_ids": {1: torch.export.Dim.AUTO}}
```
for the Whisper encoder seems to work; it makes the T5 export fail, though :D
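The error above points at the underlying constraint: the ExecuTorch lowering wants a concrete upper bound for every dynamic dimension, and Dim.AUTO leaves the bound at int_oo. A self-contained toy (not the PR's code) showing how an explicit max on the Dim satisfies that requirement:

```python
# Toy example, not the PR's code: an explicit max on the Dim gives the
# exported program a finite upper bound for dim 1 instead of int_oo,
# which is what the ExecuTorch lowering asks for.
import torch


class Toy(torch.nn.Module):
    def forward(self, encoder_hidden_states: torch.Tensor) -> torch.Tensor:
        return encoder_hidden_states * 2


seq_dim = torch.export.Dim("encoder_hidden_seq_length", max=4096)
ep = torch.export.export(
    Toy(),
    (torch.rand(1, 1500, 384),),
    dynamic_shapes={"encoder_hidden_states": {1: seq_dim}},
)
print(ep)
```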
guangy10
left a comment
Putting it back in your queue. Feel free to ping me for review on Discord or tag me here.
Re-run tests on CI
Looks great! Can you fix the linter by running
Co-authored-by: Guang Yang <42389959+guangy10@users.noreply.github.com>
Hi!
First of all, thanks for taking the time to build this project. I think it's a great step toward a better DX with ExecuTorch!
As per the discussions in the React Native ExecuTorch channel on the ET Discord, this PR makes it possible to export Whisper using optimum-executorch.